Questioned Electronic Documents: Empirical Studies in Authorship Attribution
Abstract
Forensic analysis of questioned electronic documents is very difficult, because the nature of the documents eliminates many kinds of informative differences. Recent work in authorship attribution demonstrates the practicality of analyzing documents based on authorial style, but the state of the art is confusing. Analyses are difficult to apply, little is known about type or rate of errors, and no “best practices” are available. We present the results of some recent experiments and software development to address these issues, partly through the development of a systematic testbed for multilingual, multigenre authorship attribution accuracy, and partly through the development and concurrent analysis of a uniform and portable software tool that applies multiple methods to analyze electronic documents for authorship based on authorial style.
Similar resources
Authorship, Practical Authorship and Documentary Boundary Objects in Archaeological Information Work
On the basis of an empirical investigation of archaeological information work, this paper discusses the interplay of authorship of documents and documentary boundary objects, and the practical authorship of social situations and identities and how a closer look at the authorship (as understood in the contemporary authorship literature) can be helpful in elaborating our understanding of the maki...
Clustering by Authorship Within and Across Documents
The vast majority of previous studies in authorship attribution assume the existence of documents (or parts of documents) labeled by authorship to be used as training instances in either closed-set or open-set attribution. However, in several applications it is not easy or even possible to find such labeled data and it is necessary to build unsupervised attribution models that are able to estim...
Determining Writership of Historical Manuscripts using Computational Methods
The role of computational methods in the determination of authorship of historical manuscripts is considered. During the last few years the computational forensics community has developed automation tools for forensic document examination, in particular for determining whether a given handwriting specimen can be attributed to known writing. We describe how these methods can be used with histori...
Authorship Attribution Using Word Network Features
In this paper, we explore a set of novel features for authorship attribution of documents. These features are derived from a word network representation of natural language text. As has been noted in previous studies, natural language tends to show complex network structure at word level, with low degrees of separation and scale-free (power law) degree distribution. There has also been work on ...